" Unit 3 - Lecture 2 "
"------------------------------------------------------------------------"

" Working with Statistical Distributions "

" Key letter:
  d : PMF P(X = x) / PDF f(x)
  p : CDF F(x) = P(X <= x)
  q : Quantile function (inverse of the CDF)
  r : Simulate a random sample "
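
"
A minimal sketch of the four prefixes,
using the standard Normal as an assumed example.
It shows that q undoes p, and that r draws random values.
The names ending in .demo are made up for illustration.
"

x.demo = 1.96

d.demo = dnorm(x.demo)   # PDF height f(1.96)
p.demo = pnorm(x.demo)   # CDF P(X <= 1.96), roughly 0.975
q.demo = qnorm(p.demo)   # quantile function maps p.demo back to 1.96

set.seed(1)
r.demo = rnorm(5)        # five simulated values from N(0,1)
r.demo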


"------------------------------------------------------------------------"

" Continuous Distributions "

"------------------------------------------------------------------------"

" 1.) Normal Distribution "

" 
Problem:
Visualize the PDF and CDF of the Normal Distribution
for a fixed value of Sigma and varying values
of Mu.

Similarly,
visualize the PDF and CDF of the Normal Distribution
for a fixed value of Mu and varying values
of Sigma.
"

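# Baseline curve: standard Normal, Mu = 0 and Sigma = 1
sigma = 1
mu = 0

X = seq(-10,10,0.1)
PDF = dnorm(X,mu,sigma)

plot(X,
     PDF,
     type = "l")

# Overlay the PDF for other values of Mu, keeping Sigma fixed
mu = c(-2,-1,1,2)
COLOR = c("pink","red","orange","purple")

for(i in seq_along(mu)){
  
  PDF = dnorm(X,mu[i],sigma)
  
  lines(X,
        PDF,
        col = COLOR[i])
  
}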
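
"
One possible way to cover the remaining parts of the problem (a sketch only):
the CDF for varying Mu, then the PDF and CDF for varying Sigma.
The Sigma values and the names X.grid, MU.SET, SIGMA.SET and COLOR.SET
are assumptions chosen for illustration.
"

X.grid = seq(-10,10,0.1)
COLOR.SET = c("pink","red","orange","purple")

# CDF: Sigma fixed at 1, Mu varying
MU.SET = c(-2,-1,1,2)
plot(X.grid, pnorm(X.grid,0,1), type = "l", ylab = "CDF")
for(i in seq_along(MU.SET)){
  lines(X.grid, pnorm(X.grid,MU.SET[i],1), col = COLOR.SET[i])
}

# PDF: Mu fixed at 0, Sigma varying
SIGMA.SET = c(0.5,2,3,4)
# The smallest Sigma gives the tallest peak, so it sets the y-axis range
PEAK = dnorm(0,0,min(SIGMA.SET))
plot(X.grid, dnorm(X.grid,0,1), type = "l", ylab = "PDF",
     ylim = c(0,PEAK))
for(i in seq_along(SIGMA.SET)){
  lines(X.grid, dnorm(X.grid,0,SIGMA.SET[i]), col = COLOR.SET[i])
}

# CDF: Mu fixed at 0, Sigma varying
plot(X.grid, pnorm(X.grid,0,1), type = "l", ylab = "CDF")
for(i in seq_along(SIGMA.SET)){
  lines(X.grid, pnorm(X.grid,0,SIGMA.SET[i]), col = COLOR.SET[i])
}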




"------------------------------------------------------------------------"

"
Other Continuous Distributions:
- Exponential: iexp()
- Gamma: igamma()
- LogNormal: ilnorm()
- Beta: ibeta()
- Continuous Uniform: iunif()
- F: if()
- t: it()
- Chi-Sq: ichisq()

Where i = d, p, q, r

"

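"
A short sketch of the naming pattern above, using the Exponential
distribution as the example. The rate of 2 and the names ending
in .exp are assumptions made for illustration.
"

rate.exp = 2

d.exp = dexp(1,rate.exp)     # PDF at x = 1
p.exp = pexp(1,rate.exp)     # P(X <= 1) = 1 - exp(-rate)
q.exp = qexp(0.5,rate.exp)   # median = log(2) / rate

set.seed(1)
r.exp = rexp(10000,rate.exp) # simulated sample
mean(r.exp)                  # should be close to 1 / rate
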
"------------------------------------------------------------------------"

"
Exam Question:

A random variable Y is assumed
to follow a Gamma Distribution 
with mean 4 and variance 8.

(a) Simulate 100 values from 
    the above distribution,
    using a seed of 1. 
    Show the empirical mean, median and mode.
    
(b) Compare your answers in part (a)
    with the population values for 
    the measures of central tendency.
"

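"
A sketch of where the parameter values come from,
assuming the shape / rate parameterisation that rgamma() uses:
  mean     = alpha / lambda
  variance = alpha / lambda^2
so lambda = mean / variance and alpha = mean^2 / variance.
The names ending in .Y are introduced here for illustration.
"

mean.Y = 4
var.Y = 8

lambda.Y = mean.Y / var.Y     # rate
alpha.Y = mean.Y^2 / var.Y    # shape

c(alpha.Y,lambda.Y)
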
"a."
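# Shape and rate chosen so that the mean is 4 and the variance is 8
alpha = 2      # shape
lambda = 0.5   # rate

set.seed(1)
Y = rgamma(100,alpha,lambda)
head(Y)

# Empirical mean and median
mean(Y)
median(Y)

# Empirical mode: location of the peak of the kernel density estimate
K.D = density(Y)

plot(K.D$x,
     K.D$y,
     type = "l")

K.D$x[which.max(K.D$y)]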


"b."

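# Population mean
alpha / lambda

# Population median
qgamma(0.5,alpha,lambda)

# Population mode (formula valid for alpha >= 1)
(alpha - 1) / lambda

# Numerical check of the mode: maximise the PDF over a fine grid
X = seq(0,11,0.0001)
PDF = dgamma(X,alpha,lambda)

X[which.max(PDF)]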



"Example:"

# Kernel density estimate of a N(0,1) sample (black)
# compared with the true N(0,1) PDF (red)
X = rnorm(1000)

K.D = density(X)
plot(K.D,
     type = "l",
     ylim = c(0,0.45))

lines(seq(-5,5,0.25),
      dnorm(seq(-5,5,0.25)),
      col = "red")


 "------------------------------------------------------------------------"

" Practice Question "

"
Can you confirm that,

Q = (n - 1)*S^2 / (Sigma^2) ~ Chi-sq(n - 1)


Assume:
1,000 samples,
each of size 20 coming from N(mu,sigma^2)
mu = 15
sigma = 3

set.seed(1)

a.)
Create a histogram of Q.

b.)
Superimpose on the histogram a graph of the
theoretical distribution.

c.)
Comment on how close the empirical distribution
is to the theoretical distribution.
"
n = 20
mu = 15
sigma = 3
Q = numeric(1000)   # preallocate one slot per simulated sample

set.seed(1)

for(i in 1:1000){
  
  X = rnorm(n,mu,sigma)
  Q[i] = (n - 1)*var(X) / (sigma^2)
  
}

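# a.) Histogram of Q on the density scale
hist(Q,freq = FALSE,
     ylim = c(0,0.07),
     breaks = 100)

# b.) Theoretical Chi-sq(n - 1) PDF superimposed on the histogram
X = seq(0,60,0.025)
PDF = dchisq(X,n - 1)

lines(X,
      PDF,
      col = "red")

# Kernel density estimate of Q, added for comparison
lines(density(Q),col = "blue")

# Same comparison without the histogram
plot(density(Q))

lines(X,
      PDF,
      col = "red")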
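
"
One possible numerical check for part c.) (a sketch only):
a Chi-sq(n - 1) variable has mean n - 1 and variance 2*(n - 1),
so the sample moments of Q should be close to 19 and 38.
Q.check re-simulates the statistic so that this block runs on its own;
the name is made up for illustration.
"

set.seed(1)
Q.check = replicate(1000, (20 - 1)*var(rnorm(20,15,3)) / 3^2)

mean(Q.check)   # compare with n - 1 = 19
var(Q.check)    # compare with 2*(n - 1) = 38

# Kolmogorov-Smirnov test of Q.check against Chi-sq(19);
# a large p-value gives no evidence against the theoretical distribution
ks.test(Q.check,"pchisq",df = 19)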




"------------------------------------------------------------------------"

"
Concept:
Understand QQ-Plot

"

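"
A minimal QQ-plot sketch, assuming the Chi-sq practice question
as the test case. The sorted sample values are plotted against
the theoretical quantiles at the same probabilities. Points close
to the line y = x suggest that the sample follows the theoretical
distribution. Q.qq, P.qq and T.qq are illustrative names.
"

set.seed(1)
Q.qq = replicate(1000, (20 - 1)*var(rnorm(20,15,3)) / 3^2)

P.qq = ppoints(length(Q.qq))   # plotting positions (i - 0.5) / n
T.qq = qchisq(P.qq,20 - 1)     # theoretical Chi-sq(19) quantiles

plot(T.qq,
     sort(Q.qq),
     xlab = "Theoretical quantiles",
     ylab = "Sample quantiles")

abline(0,1,col = "red")        # reference line y = x
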
"------------------------------------------------------------------------"

